swap regex backend to librure (rust-lang/regex c abi) — 27x faster than posix regex.h#574
Merged
…bi): 27x faster on the regex_match microbench (51ms posix → 1.9ms librure). same exported cs_regex_* symbols so codegen layer unchanged. linear-time guarantee (no redos), js-shaped unicode by default. ci installs rustup; release tarball ships prebuilt librure.a so end users don't need rustc. binary cost: regex-using binaries grow from 263kb to 4.1mb, non-regex binaries unchanged.
… 100k objects. regex_match is the workload the librure swap was motivated by; map_lookup exercises hash-keyed lookup at realistic scale (100k entries, 1m gets); json scaled up so yyjson's parser/serializer have enough work to be measurable. all three benches include c, go, chadscript, node implementations producing matching outputs (verified locally).
…e prepared statements + bound params for apples-to-apples vs go's database/sql layer (c jumps 0.347s to 0.032s = 3m qps), add regex_match+map_lookup to assemble_json.py meta block (json desc updated to 100k), install rustup in update-benchmarks.yml workflow + bump cache key for librure vendor build.
Summary
Replaces the POSIX `<regex.h>` regex backend in `c_bridges/regex-bridge.c` with `librure`, the C ABI for Rust's `regex` crate (rust-lang/regex). Same exported `cs_regex_*` symbols, so the codegen layer is unchanged.

Why
POSIX `regex.h` is a 1980s backtracking NFA: no JIT, no SIMD, no DFA caching. On a regex-heavy workload (100k matches, anchored pattern with one capture group) it took 51 ms. The new backend lands the same workload in 1.9 ms, a 27× speedup.
`librure` is a stable C ABI with no transitive deps beyond libc (+ `Security`/`CoreFoundation` on macOS, which rustc statically requires).

Speed
Three runs with the new backend: 1.91 ms, 1.83 ms, 1.86 ms. Same hit count (correctness preserved).
Side benefits
`RURE_FLAG_UNICODE` is on. Closer to JavaScript regex semantics than POSIX.

Cost
`cargo` (one-time `rustup` install). End users installing via the release tarball receive a prebuilt `librure.a` and never need rustc.

Files
- `c_bridges/regex-bridge.c`: full rewrite, same exported symbols
- `scripts/vendor-pins.sh`: pinned `RUST_REGEX_TAG="1.11.1"`
- `scripts/build-vendor.sh`: adds librure build step (clones + `cargo build --release` + copies `librure.a`/`rure.h`)
- `scripts/build-target-sdk.sh`: packages `librure.a` into target SDKs
- `src/compiler.ts` + `src/native-compiler-lib.ts`: link `librure.a` (+ macOS frameworks) when `usesRegex`
- `.github/workflows/ci.yml` + `cross-compile.yml`: install rustup; bump cache key to `vendor-rure1-*`; add `librure.a` to lib verify list and release packaging
- `BUILDING.md`: documents Rust as contributor-only build dep

Verification
All 4 existing regex fixtures pass with the new backend:
- `regex-character-classes`
- `regex-constructor`
- `regex-exec`
- `regex-exec-dynamic` (exercises chad-shape `string[]` return path)

Out of scope (follow-ups)
- `librure.a` ABI compat with musl needs verification
- `rustup target add aarch64-unknown-linux-gnu` in cross-compile.yml; TODO comment added
- `librure.a` per-arch fetched from GH Releases instead of built via cargo (drops the rustc requirement entirely for vendor builds)

Test plan